
APPARATUSES AND METHODS FOR USE IN CREATING AN AUDIO SCENE 
FIELD OF THE INVENTION

The present invention relates generally to
apparatuses and methods for use in creating an audio scene,
and has particular - but by no means exclusive -
application for use in creating an audio scene for a
virtual environment.

BACKGROUND OF THE INVENTION

There have been significant advances in creating
visually immersive virtual environments in recent years.
These advances have resulted in the widespread uptake of
massively multi-player role-playing games, in which
participants can enter a common virtual environment (such
as a battlefield) and are represented in the virtual
environment by an avatar, which is typically in the form of
an animated character. In the case of a virtual
environment in the form of a battlefield, that avatar could
be a soldier.

The widespread uptake of visually immersive virtual
environments is due in part to significant advances in
image processing technology that enable highly detailed
and realistic graphics for virtual environments to be generated.
The proliferation of three-dimensional sound cards provides
the ability to supply participants in a virtual environment
with high quality sound. However, despite the prolific use
of three-dimensional sound cards, today's visually immersive
virtual environments are generally unable to provide
realistic mechanisms for participants to communicate with
each other. Many environments use non-immersive
communication mechanisms such as text based chat or
walkie-talkie style voice.

DEFINITIONS

The following provides definitions for various terms
used throughout this specification:

• Weighted audio stream - audio information that comprises
one or more pieces of audio information, each of which
has an amplitude that is modified (increased or
decreased) based on a distance between a source and
recipient of the audio information.

• Unweighted audio stream - audio information that
comprises one or more pieces of audio information, but
unlike a weighted audio stream the amplitude of each
piece of audio information in an unweighted audio stream
is un-modified from the original amplitude.

• Audio Scene - audio information comprising combined
sounds (for example, voices belonging to other avatars
and other sources of sound within the virtual
environment) that are spatially placed and perhaps
attenuated according to a distance between a source and
recipient of the sound. An audio scene may also comprise
sound effects that represent the acoustic
characteristics of the environment.

SUMMARY OF THE INVENTION 

According to a first aspect of the present invention
there is provided an apparatus for creating an audio scene
for an avatar in a virtual environment, the apparatus
comprising:

an audio processor operable to create a weighted
audio stream that comprises audio from an object located in
a portion of a hearing range of the avatar; and

associating means operable to associate the weighted
audio stream with a datum that represents a location of the
portion of the hearing range in the virtual environment,
wherein the weighted audio stream and the datum represent
the audio scene.

The apparatus according to the first aspect of the
present invention has several advantages. One advantage is
that by dividing the hearing range into one or more
portions, the fidelity of the audio scene can be adjusted
to a required level. The greater the number of portions in
the hearing range, the higher the fidelity of the audio
scene. It is envisaged that the apparatus is not restricted
to a single weighted audio stream for one portion. In fact,
the apparatus is capable of creating multiple weighted audio
streams, each comprising audio from an object located in
another portion of the hearing range. Another advantage of the
apparatus is that the weighted audio stream can replicate
characteristics such as attenuation of the audio as a
result of having to travel a distance between the object
and the recipient. Yet another advantage of the present
invention is that the audio stream can be reproduced as if
it emanated from the location. Thus, if the datum indicated
that the location of the object was to the right hand side
of the recipient, the audio could be reproduced using the
right channel of a stereo sound system.

Preferably, the audio processor is further operable
to create the weighted audio stream such that it comprises
an unweighted audio stream that comprises audio from
another object located in the portion of the hearing range
of the avatar.

An advantage of including the unweighted audio stream
in the weighted audio stream is that it provides a means
for representing audio from one or more other objects that
are located at the periphery of the portion of the hearing
range of the avatar. An advantage of the unweighted audio
stream is that it can be reused for creating audio scenes
of many avatars, which can reduce the overall processing
requirements for creating the audio scene.

Preferably, the audio processor is operable to create
the weighted audio stream in accordance with a
predetermined mixing operation, the predetermined mixing
operation comprising identification information that
identifies the object and/or the other objects, and
weighting information that can be used by the audio
processor to set an amplitude of the audio and unweighted
audio stream in the weighted audio stream.
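
As a rough illustration of how such a mixing operation might be represented and applied, the following Python sketch pairs each identified input with a weight and sums the weighted samples. The names MixingOperation, source_ids, weights and apply_mixing_operation are hypothetical, as is the representation of audio as equal-length lists of samples; the specification does not prescribe any of them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MixingOperation:
    # Hypothetical record: identification information (source_ids) and
    # weighting information (one amplitude weight per identified input).
    source_ids: List[str]
    weights: List[float]


def apply_mixing_operation(op: MixingOperation,
                           streams: Dict[str, List[float]]) -> List[float]:
    """Return the weighted sum of the equal-length streams named by op."""
    length = len(streams[op.source_ids[0]])
    mixed = [0.0] * length
    for source_id, weight in zip(op.source_ids, op.weights):
        for i, sample in enumerate(streams[source_id]):
            # The weight sets the amplitude of this input in the output.
            mixed[i] += weight * sample
    return mixed
```

In this sketch an unweighted audio stream would simply be another entry in streams, mixed in with its own weight alongside the audio from individual objects.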

Preferably, the apparatus further comprises a
communication means operable to receive the audio, the
unweighted audio stream and the mixing operation via a
communication network, the communication means further
being operable to send the weighted audio stream and the
datum via the communication network.

Using the communication means is advantageous because
it enables the apparatus to be used in a distributed
environment.

According to a second aspect of the present
invention, there is provided an apparatus operable to
create audio information for use in an audio scene for an
avatar in a virtual environment, the apparatus comprising:

an audio processor operable to create an unweighted
audio stream that comprises audio from an object located in
a portion of a hearing range of the avatar; and

associating means operable to associate the
unweighted audio stream with a datum that represents an
approximate location of the object in the virtual
environment, wherein the unweighted audio stream and the
datum represent the audio information.

The apparatus according to the second aspect of the
present invention has several advantages, two of which are
similar to the aforementioned first and second advantages
of the first aspect of the present invention.

Preferably, the audio processor is operable to create
the unweighted audio stream in accordance with a
predetermined mixing operation, the predetermined mixing
operation comprising identification information that
identifies the object.

Preferably, the apparatus further comprises a
communication means operable to receive the audio and the
predetermined mixing operation via a communication network,
the communication means also being operable to send the
unweighted audio stream and the datum via the communication
network.

Using the communication means is advantageous because
it enables the apparatus to be used in a distributed
environment.

According to a third aspect of the present invention
there is provided an apparatus for obtaining information
that can be used to create an audio scene for an avatar in
a virtual environment, the apparatus comprising:

identifying means operable to determine an identifier
of an object located in a portion of a hearing range of the
avatar;

weighting means operable to determine a weighting to
be applied to audio from the object; and

locating means operable to determine a location of
the portion in the virtual environment, wherein the
identifier, weighting and the location represent the
information that can be used to create the audio scene.

The ability of the third aspect of the present
invention to obtain the weighting and the location is
advantageous for several reasons. First, the weighting can
be used to create a weighted audio stream that comprises
the audio from the object. In this regard, the weighting
can be used to set an amplitude of the audio when inserted
into the weighted audio stream. Second, the location can be
used to reproduce the audio as if it were coming from the
location. For example, if the location indicated that the
location of the object was to the right hand side of the
recipient, the audio could be reproduced using the right
channel of a stereo sound system.
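
One way to picture the right-channel example is a simple stereo pan driven by the location. The sketch below is an assumption-laden illustration, not the rendering method of the specification: it assumes a two-dimensional environment, a listener facing the positive y direction, and a constant-power pan law, and the function name stereo_gains is hypothetical.

```python
import math


def stereo_gains(listener_xy, source_xy):
    """Return (left_gain, right_gain) for a source relative to a listener.

    Assumes the listener faces +y, so a positive x offset means the
    source lies to the listener's right.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)
    # pan runs from -1.0 (fully left) to +1.0 (fully right).
    pan = 0.0 if distance == 0 else dx / distance
    # Constant-power pan law: left^2 + right^2 == 1 for every pan value.
    theta = (pan + 1.0) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)
```

With these assumptions, a source directly to the right of the listener is reproduced almost entirely in the right channel, while a source straight ahead is split equally between the two channels.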



Preferably, the apparatus further comprises a
communication means operable to send, via a communication
network, the identifier, the weighting and the location to
one of a plurality of systems for processing.

Using the communication means is advantageous because
it enables the apparatus to be used in a distributed
environment. Furthermore, it enables the apparatus to send
the identifier, the weighting and the location to a system
that has the necessary resources (processing ability) to
perform the required processing.

Preferably, the communication means is further
operable to create routeing information for the
communication network, wherein the routeing information is
such that it can be used by the communication network to
route the audio to the one of the plurality of systems for
processing.

Being able to provide the routeing information is
advantageous because it allows the apparatus to effectively
select the links in the communication network that will be
used to transfer the audio.

Preferably, the identifying means, the weighting
means and the locating means are operable to respectively
determine the identifier, the weighting and the location by
processing a representation of the virtual environment.

Preferably, the identifying means is operable to
determine the portion of the hearing range by:

selecting a first of a plurality of avatars in the
virtual environment;

identifying a second of the plurality of avatars that
is proximate the first of the avatars;

determining whether the second of the avatars can be
included in an existing cluster;

including the second of the avatars in the existing
cluster upon determining that it can be included therein;

creating a new cluster that includes the second of
the avatars upon determining that the second of the avatars
cannot be included in the existing cluster to thereby
create a plurality of clusters;

determining an angular gap between two of the
clusters;

creating a further cluster that is substantially
located in the angular gap; and

including at least one of the avatars in the further
cluster.

Alternatively, the identifying means is operable to
determine the portion of the hearing range by:

selecting one of a plurality of avatars in the
virtual environment;

determining a radial ray that extends from the avatar
to the one of the plurality of avatars;

calculating the absolute angular distance that each
of the plurality of avatars is from the radial ray;

arranging the absolute angular distance of each of
the avatars into an ascending ordered list;

calculating a differential angular separation between
successive ones of the absolute angular distance in the
ascending ordered list;

selecting at least one of the differential angular
separation that has a higher value than another
differential angular separation; and

determining another radial ray that emanates from the
avatar and which bisects two of the avatars that are
associated with the at least one of the differential
angular separation.
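
The alternative steps above can be sketched in Python as follows. This is only one possible reading: it assumes a two-dimensional environment, measures each absolute angular distance counter-clockwise from the first radial ray in the range 0 to 2π, and also treats the wrap-around gap between the last and first entries of the ordered list as a candidate separation, which the steps above do not mention. The function name boundary_rays and its parameters are hypothetical.

```python
import math


def boundary_rays(listener, others, num_rays=1):
    """Return angles (radians) of rays that bisect the widest angular gaps.

    listener: (x, y) of the avatar whose hearing range is being divided.
    others:   list of (x, y) avatars; others[0] is the selected avatar
              that defines the first radial ray.
    """
    ref = others[0]
    ref_angle = math.atan2(ref[1] - listener[1], ref[0] - listener[0])
    # Angular distance of every avatar from the first radial ray, sorted
    # into an ascending ordered list.
    offsets = sorted(
        (math.atan2(y - listener[1], x - listener[0]) - ref_angle)
        % (2 * math.pi)
        for x, y in others
    )
    # Differential separation between successive entries, stored as
    # (width, start_offset).
    gaps = [(offsets[i + 1] - offsets[i], offsets[i])
            for i in range(len(offsets) - 1)]
    # Assumption: the gap that wraps from the last entry back to the
    # first entry is also considered.
    gaps.append((2 * math.pi - offsets[-1] + offsets[0], offsets[-1]))
    # Select the largest separations and bisect each with a new ray.
    gaps.sort(reverse=True)
    return [(ref_angle + start + width / 2) % (2 * math.pi)
            for width, start in gaps[:num_rays]]
```

Each returned angle can be read as a boundary between two portions of the hearing range; asking for more rays yields more portions.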

According to a fourth aspect of the present invention
there is provided an apparatus for creating information
that can be used to create an audio scene for an avatar in
a virtual environment, the apparatus comprising:

identifying means operable to determine an identifier
of an object located in a portion of a hearing range of the
avatar; and

locating means operable to determine an approximate
location of the object in the virtual environment, wherein
the identifier and the approximate location represent the
information that can be used to create the audio scene.



Determining the approximate location of the object is
advantageous because it can be used to reproduce audio from
the object as if it were emanating from the location.

Preferably, the apparatus further comprises a
communication means operable to send, via a communication
network, the identifier and the location to one of a
plurality of systems for processing.

Using the communication means is advantageous because
it enables the apparatus to be used in a distributed
environment. Furthermore, it enables the apparatus to send
the identifier, the weighting and the location to a system
that has the necessary resources (processing ability) to
perform the required processing.

Preferably, the communication means is further
operable to create routeing information for the
communication network, wherein the routeing information is
such that it can be used by the communication network to
route the audio to the one of the plurality of systems for
processing.

Being able to provide the routeing information is
advantageous because it allows the apparatus to effectively
select the links in the communication network that will be
used to transfer the audio.

Preferably, the identifying means and the locating
means are operable to respectively determine the identifier
and the location by processing a representation of the
virtual environment.

Preferably, the identifying means is operable to
determine the approximate location of the object by:

dividing the virtual environment into a plurality of
cells; and

determining a location in one of the cells about
which the object is located.
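
A minimal Python sketch of this cell-based approximation follows. It assumes a two-dimensional environment divided into square cells and takes the arithmetic mean of the positions in each cell as the approximate location (the embodiment described later calls this point a 'centre of mass'; computing it as a mean is an assumption). The function name cell_centres and its parameters are hypothetical.

```python
from collections import defaultdict


def cell_centres(positions, cell_size):
    """Map each occupied grid cell to the mean position of its objects.

    positions: iterable of (x, y) object locations.
    cell_size: side length of each square cell.
    """
    cells = defaultdict(list)
    for x, y in positions:
        # Integer cell index along each axis.
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append((x, y))
    return {
        key: (sum(p[0] for p in pts) / len(pts),
              sum(p[1] for p in pts) / len(pts))
        for key, pts in cells.items()
    }
```

For example, cell_centres([(1, 1), (3, 3), (12, 2)], 10) groups the first two objects into one cell and the third into a neighbouring cell, giving one approximate location per occupied cell.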

According to a fifth aspect of the present invention
there is provided an apparatus for rendering an audio scene
for an avatar in a virtual environment, the apparatus
comprising:

obtaining means operable to obtain a weighted audio
stream that comprises audio from an object located in a
portion of a hearing range of the avatar, and a datum that
is associated with the weighted audio stream and which
represents a location of the portion of the hearing range
in the virtual environment; and

a spatial audio rendering engine that is operable to
process the weighted audio stream and the datum in order to
render the audio scene.

According to a sixth aspect of the present invention
there is provided a method of creating an audio scene for
an avatar in a virtual environment, the method comprising
the steps of:

creating a weighted audio stream that comprises audio
from an object located in a portion of a hearing range of
the avatar; and

associating the weighted audio stream with a datum
that represents a location of the portion of the hearing
range in the virtual environment, wherein the weighted
audio stream and the datum represent the audio scene.

Preferably, the step of creating the weighted audio
stream is such that the weighted audio stream comprises an
unweighted audio stream that comprises audio from another
object located in the portion of the hearing range of the
avatar.

Preferably, the step of creating the weighted audio 
stream is carried out in accordance with a predetermined 
mixing operation, the predetermined mixing operation 
comprising identification information that identifies the 
object and/or the other objects, and weighting information 
that can be used by the audio processor to set an amplitude 
of the audio and unweighted audio stream in the weighted 
audio stream. 

Preferably, the method further comprises the steps
of:

receiving the audio, the unweighted audio stream and
the mixing operation via a communication network; and

sending the weighted audio stream and the datum via
the communication network.

According to a seventh aspect of the present
invention, there is provided a method of creating audio
information for use in an audio scene for an avatar in a
virtual environment, the method comprising the steps of:

creating an unweighted audio stream that comprises
audio from an object located in a portion of a hearing
range of the avatar; and

associating the unweighted audio stream with a datum
that represents an approximate location of the object in
the virtual environment, wherein the unweighted audio
stream and the datum represent the audio information.

Preferably, the step of creating the unweighted audio
stream is carried out in accordance with a predetermined
mixing operation, wherein the predetermined mixing
operation comprises identification information that
identifies the object.

Preferably, the method further comprises the steps
of:

receiving the audio and the predetermined mixing
operation via a communication network; and

sending the unweighted audio stream and the datum via
the communication network.

According to an eighth aspect of the present invention
there is provided a method of obtaining information that
can be used to create an audio scene for an avatar in a
virtual environment, the method comprising the steps of:

determining an identifier of an object located in a
portion of a hearing range of the avatar;

determining a weighting to be applied to audio from
the object; and

determining a location of the portion in the virtual
environment, wherein the identifier, weighting and the
location represent the information that can be used to
create an audio scene.

Preferably, the method further comprises the step of
sending, via a communication network, the identifier, the
weighting and the location to one of a plurality of systems
for processing.

Preferably, the method further comprises the step of
creating routeing information for the communication
network, wherein the routeing information is such that it
can be used by the communication network to route the audio
to the one of the plurality of systems for processing.

Preferably, the steps of determining the identifier,
the weighting and the location respectively comprise
determining the identifier, the weighting and the location
by processing a representation of the virtual environment.

Preferably, the method further comprises the
following steps to determine the portion of the hearing
range:

selecting a first of a plurality of avatars in the
virtual environment;

identifying a second of the plurality of avatars that
is proximate the first of the avatars;

determining whether the second of the avatars can be
included in an existing cluster;

including the second of the avatars in the existing
cluster upon determining that it can be included therein;

creating a new cluster that includes the second of
the avatars upon determining that the second of the avatars
cannot be included in the existing cluster to thereby
create a plurality of clusters;

determining an angular gap between two of the
clusters;

creating a further cluster that is located in the
angular gap; and

including at least one of the avatars in the further
cluster.

Alternatively, the method comprises the following
steps to determine the portion of the hearing range:

selecting one of a plurality of avatars in the
virtual environment;

determining a radial ray that extends from the avatar
to the one of the plurality of avatars;

calculating the absolute angular distance that each
of the plurality of avatars is from the radial ray;

arranging the absolute angular distance of each of
the avatars into an ascending ordered list;

calculating a differential angular separation between
successive ones of the absolute angular distance in the
ascending ordered list;

selecting at least one of the differential angular
separation that has a higher value than another
differential angular separation; and

determining another radial ray that emanates from the
avatar and which bisects two of the avatars that are
associated with the differential angular separation.

According to a ninth aspect of the present invention
there is provided a method of creating information that can
be used to create an audio scene for an avatar in a virtual
environment, the method comprising the steps of:

determining an identifier of an object located in a
portion of a hearing range of the avatar; and

determining an approximate location of the object in
the virtual environment, wherein the identifier and the
approximate location represent the information that can be
used to create the audio scene.



Preferably, the method further comprises the step of
sending, via a communication network, the identifier and
the location to one of a plurality of systems for
processing.

Preferably, the method further comprises the step of
creating routeing information for the communication
network, wherein the routeing information is such that it
can be used by the communication network to route the audio
to the one of the plurality of systems for processing.

Preferably, the steps of determining the identifier
and the approximate location respectively comprise the step
of determining the identifier and the location by
processing a representation of the virtual environment.

Preferably, the method further comprises the
following steps to determine the approximate location of
the object:

dividing the virtual environment into a plurality of
cells; and

determining a location in one of the cells about
which the object is located.

According to a tenth aspect of the present invention
there is provided a method of rendering an audio scene for
an avatar in a virtual environment, the method comprising
the steps of:

obtaining a weighted audio stream that comprises
audio from an object located in a portion of a hearing
range of the avatar, and a datum that is associated with
the weighted audio stream and which represents a location
of the portion of the hearing range in the virtual
environment; and

processing the weighted audio stream and the datum in
order to render the audio scene.

According to an eleventh aspect of the present
invention there is provided a computer program comprising
at least one instruction for causing a computing device to
carry out the method according to the sixth, seventh,
eighth, ninth or tenth aspect of the present invention.

According to a twelfth aspect of the present
invention there is provided a computer readable medium
comprising the computer program according to the eleventh
aspect of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Notwithstanding any other embodiments that may fall
within the scope of the present invention, an embodiment of
the present invention will now be described, by way of
example only, with reference to the accompanying figures,
in which:

figure 1 provides a block diagram of a system in
accordance with the embodiment of the present invention;

figure 2 provides a flow chart of various steps 
performed by the system shown in figure 1; 

figure 3 provides a flow chart of the steps involved 
in a grid summarisation algorithm used in the system shown 
in figure 1; 

figure 4 illustrates a map used by the system shown
in figure 1;

figure 5 illustrates a control table used by the 
system shown in figure 1; 

figure 6 provides a flow chart of the steps involved
in a cluster summarisation algorithm used in the system
shown in figure 1;

figure 7 is an illustration of the clusters formed
using the algorithm of figure 6;

figure 8 is a flow chart of the various steps
involved in an alternative clustering algorithm;

figure 9 provides a visual depiction of the result of
running the alternative clustering algorithm of figure 8 on
the map shown in figure 4;

figure 10 illustrates another control table used by 
the system shown in figure 1; 

figure 11 provides a flow chart of the steps involved 
in a process performed by the system shown in figure 1; 

figure 12 provides a flow chart of the steps involved
in a process performed by the system shown in figure 1.

AN EMBODIMENT OF THE INVENTION 

With reference to figure 1, which illustrates a
system 101 embodying the present invention, the system 101
comprises: an audio scene creation system 103; a virtual
environment state maintenance system 105; and a client
computing device 107. The system 101 also comprises a
communication network 109. The audio scene creation system
103, the virtual environment state maintenance system 105
and the client computing device 107 are connected to the
communication network 109 and arranged to use the network
109 in order to operate in a distributed manner; that is,
exchange information with each other via the communication
network 109. The communication network 109 is in the form
of a public access packet switched network such as the
Internet, and is therefore made up of numerous interconnected
routers (not shown in the figures).

Generally speaking, the virtual environment state
maintenance system 105 is arranged to maintain dynamic
state information pertaining to a virtual environment (such
as a battlefield). The dynamic state information maintained
by the system 105 includes, for example, the location of
various avatars in the virtual environment and, where the
virtual environment relates to a game, individual players'
scores. The audio scene creation system 103 is basically
arranged to create and manage the real-time audio related
aspects of participants in the virtual environment (such as
the participants' voices); that is, create and manage audio
scenes. The client computing device 107 is essentially
arranged to interact with the virtual environment state
maintenance system 105 and the audio scene creation system
103 to allow a person using the client computing device 107
to participate in the virtual environment.

More specifically, the virtual environment state
maintenance system 105 is in the form of a computer server
(or in an alternative embodiment, a plurality of
distributed computer servers interconnected to each other)
that comprises traditional computer hardware such as a
motherboard, hard disk storage, and random access memory.
In addition to the hardware the computer server also
comprises an operating system (such as Linux or Microsoft
Windows) that performs various system level operations (for
example, memory management). The operating system also
provides an environment for executing application software.
In this regard, the computer server comprises an
application package that is loaded on the hard disk storage
and which is capable of maintaining the dynamic state
information pertaining to the virtual environment. In this
regard, if the virtual environment was, for example, a
battlefield then the dynamic state information may indicate
that a particular avatar (which, for example, represents a
soldier) is situated in a tank. The virtual environment
state maintenance system 105 essentially comprises two
modules 111 and 113 in the form of software. The first of
the modules 111 is essentially responsible for sending and
receiving the dynamic state information (pertaining to the
virtual environment) to/from the client computing device
107. The second of the modules 113 is arranged to send the
dynamic state information to the audio scene creation
system 103.

As mentioned previously, the audio scene creation
system 103 is basically arranged to create and manage audio
scenes. Each audio scene basically represents a realistic
reproduction of the sounds that would be heard by an avatar
in the virtual environment. In order to create the audio
scenes, the audio scene creation system 103 comprises a
control server 115, a summarisation server 117 (alternative
embodiments of the present invention may include a
plurality of distributed summarisation servers), and a
plurality of distributed scene creation servers 119. The
control server 115, the summarisation server 117 and the
plurality of distributed scene creation servers 119 are
connected to the communication network 109 and use the
communication network 109 to cooperate with each other in a
distributed fashion.

The control server 115 is in the form of a computer
server that comprises traditional computer hardware such as
a motherboard, hard disk storage, and random access memory.
In addition to the hardware the computer server also
comprises an operating system (such as Linux or Microsoft
Windows) that performs various system level operations. The
operating system also provides an environment for executing
application software. In this regard, the computer server
comprises application software that is loaded on the hard
disk storage and which is arranged to carry out the various
steps of the flow chart 201 shown in figure 2. The first
step 203 that the application software performs is to
interact with the virtual environment state maintenance
system 105 to obtain the dynamic state information
pertaining to the virtual environment. The application
software obtains and processes the dynamic state
information in order to identify the various avatars
present in the virtual environment and the location of the
avatars in the virtual environment. The virtual environment
state maintenance system 105 can also process the dynamic
state information to obtain details of the status of the
avatars (for example, active or inactive) and details of
any sound barriers. To obtain the dynamic state information
the application software of the control server 115
interacts with the second of the modules 113 in the virtual
environment state maintenance system 105 via the
communication network 109.

Once the application software of the control server
115 has obtained the dynamic state information from the
virtual environment state maintenance system 105, it
proceeds to process the dynamic state information in order
to create a number of mixing operations that are processed
by the summarisation server 117 and scene creation servers
119 in order to create audio scenes for each avatar in the
virtual environment. Following on from the initial step 203
the control server 115 performs the step 205 of running a
grid summarisation algorithm. With reference to figure 3,
which shows a flow chart 301 of the grid summarisation
algorithm, the first step 303 of the grid summarisation
algorithm is to use the dynamic state information obtained
during the initial step 203 to form a map 401, which can be
seen in figure 4, of the virtual environment. The map 401
is divided into a plurality of cells and depicts the
location of the avatars in the virtual environment. The map
401 depicts the avatars as small black dots. Whilst
the present embodiment includes only a single map 401, it
is envisaged that multiple maps 401 could be employed in
alternative embodiments of the present invention.

It is noted that each avatar in the virtual
environment is considered to have a hearing range that is
divided into an interactive zone and a background zone. The
interactive zone is generally considered the section of the
hearing range immediately surrounding the avatar, whilst
the background zone is the section of the hearing range
that is located around the periphery (outer limits) of the
hearing range. As an example, the interactive zone of a
hearing range of an avatar is shown in figure 4 as a circle
surrounding the avatar.



In forming the map 401, the application software of
the control server 115 ensures that the size of each cell
is greater than or equal to the interactive zone of the
avatars.



The next step 305 performed when carrying out the
grid summarisation algorithm is to determine a 'centre of
mass' of each of the cells in the map 401. The centre of
mass is basically determined by identifying the point in
each cell around which the avatars therein are centred. The
centre of mass can be considered an approximate location of
the avatars in the virtual environment. The final step 307
in the grid summarisation algorithm is to update a control
table 501 (which is shown in figure 5) used by the
summarisation server 117 based on the map 401. The control
table 501 comprises a plurality of rows, each of which
represents one of the cells in the map 401. Each row also
contains an identifier of each avatar in the respective
cell and the centre of mass thereof. Each row in the
control table 501 can effectively be considered an
unweighted mixing operation. In order to update the control
table 501 the application software of the control server
115 interacts with the summarisation server 117 via the
communication network 109.
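
The grid summarisation steps 303 to 307 can be sketched as
follows. This is a minimal illustration only: it assumes
square cells, takes the centre of mass to be the mean
position of the avatars in a cell, and uses hypothetical
names (grid_summarise, the "streams" and "centre" keys)
that do not appear in the specification.

```python
from collections import defaultdict


def grid_summarise(avatars, cell_size):
    """Group avatars into square cells and find each cell's centre of mass.

    avatars:   dict mapping an avatar identifier to its (x, y) position
               (hypothetical layout, assumed for this sketch).
    cell_size: side length of a cell; per the text it should be at least
               as large as the interactive zone of an avatar.
    Returns a dict mapping a cell index to one control-table row.
    """
    cells = defaultdict(list)
    for avatar_id, (x, y) in avatars.items():
        cells[(int(x // cell_size), int(y // cell_size))].append(avatar_id)

    table = {}
    for cell, members in cells.items():
        # Assumption: the centre of mass is the mean avatar position.
        cx = sum(avatars[a][0] for a in members) / len(members)
        cy = sum(avatars[a][1] for a in members) / len(members)
        table[cell] = {"streams": sorted(members), "centre": (cx, cy)}
    return table


avatars = {"S1": (1.0, 1.0), "S2": (3.0, 1.0), "S3": (12.0, 5.0)}
control_table_501 = grid_summarise(avatars, cell_size=10.0)
```

Each entry of the returned table plays the role of one row
of the control table 501: the streams to be mixed and the
centre of mass of the cell.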



Once the application software of the control server
115 has completed the step 205 of running the grid
summarisation algorithm, the next step 207 it performs is
to run a cluster summarisation algorithm. Figure 6 provides
a flow chart 601 of the various steps involved in the
cluster summarisation algorithm. The first step 603 of the
cluster summarisation algorithm is to select a first of the
avatars in the virtual environment. Following on from the
first step 603 the cluster summarisation algorithm involves
the step 605 of selecting a second of the avatars that is
closest to the first of the avatars, which was selected
during the first step 603. Once the second of the avatars
has been selected, the cluster summarisation algorithm
involves the step 607 of determining whether the second of
the avatars fits into a previously defined cluster.
Following on from the previous step 607 the cluster
summarisation algorithm involves the step 609 of placing
the second of the avatars into the previously defined
cluster if it fits therein. On the other hand, if it is
determined that the second of the avatars does not fit
into a previously defined cluster then the cluster
summarisation algorithm involves carrying out the step 611
of establishing a new cluster that is centred around the
second of the avatars. It is noted that the preceding
steps 603 to 611 are performed until a predetermined number
of clusters M are established.



Once the M clusters have been established, the
cluster summarisation algorithm involves performing the
step 613 of finding the largest angular gap between the M
clusters. Once the largest angular gap has been determined
the cluster summarisation algorithm involves the step 615
of establishing a new cluster in the largest angular gap.
The previous steps 613 and 615 are repeated until a total
of K clusters have been established. It is noted that the
number of M clusters is less than or equal to the number
of K clusters.
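
Steps 613 and 615 can be illustrated with the sketch below.
It assumes that each cluster is represented by a single
direction in degrees as seen from the first of the avatars,
and that a new cluster is placed at the bisector of the
largest gap. Both choices, and the function names, are
assumptions made for the example; the specification states
only that the new cluster is established in the largest
angular gap.

```python
def largest_angular_gap(angles_deg):
    """Return (gap, bisector) of the widest empty arc between directions."""
    ordered = sorted(a % 360.0 for a in angles_deg)
    best_gap, best_mid = -1.0, 0.0
    for i, start in enumerate(ordered):
        end = ordered[(i + 1) % len(ordered)]
        # Arc from this direction to the next one, wrapping past 360.
        gap = (end - start) % 360.0
        if len(ordered) == 1:
            gap = 360.0  # a single direction leaves the whole circle free
        if gap > best_gap:
            best_gap, best_mid = gap, (start + gap / 2.0) % 360.0
    return best_gap, best_mid


def add_clusters(angles_deg, k):
    """Repeat steps 613 and 615 until k cluster directions exist."""
    angles = list(angles_deg)
    while len(angles) < k:
        _, mid = largest_angular_gap(angles)
        angles.append(mid)
    return sorted(angles)
```

For example, starting from M = 3 clusters at 0, 90 and 180
degrees, the first new cluster falls at 270 degrees, which
is the middle of the 180 degree gap.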



The final step 617 of the cluster summarisation
algorithm involves placing all remaining avatars within the
best of the K clusters, which are those clusters that
result in the least angular error; that is, the angular
difference between where a sound source is rendered from
the perspective of the first of the avatars and the actual
location of the sound source if the sound from the source
was not summarised.

Once the steps 603 to 617 of the cluster
summarisation algorithm have been performed the application
software running on the control server 115 proceeds to
carry out the last step 209, which is discussed in detail
in subsequent paragraphs of this specification. An
illustration of the clusters established using the cluster
summarisation algorithm is shown in figure 7.

Persons skilled in the art will readily appreciate
that the present invention is not limited to being used
with the aforementioned clustering algorithm. By way of
example, the following describes an alternative clustering
algorithm that can be employed in another embodiment of the
present invention. The flow chart 801 in figure 8 shows
the steps involved in the alternative clustering algorithm.

The first step 803 of the alternative cluster
summarisation algorithm is to select one of the avatars in
the virtual environment. The next step 805 is to then
determine the total number of avatars and grid summaries
that are located in the hearing range of the avatar. The
grid summaries are essentially unweighted audio streams
produced by the summarisation server 117. A detailed
description of this aspect of the summarisation server 117
is set out in subsequent paragraphs of this specification.

Following on from the previous step 805, the next
step 807 is to assess whether the total number of avatars
and grid summaries in the hearing range is less than or
equal to K, which is a number selected based on the amount
of bandwidth available for transmitting an audio scene. If
it is determined that the total number of avatars and grid
summaries is less than or equal to K, then the application
software running on the control server 115 proceeds to the
final step 209 of the algorithm (which is discussed in
subsequent paragraphs of this specification).

In the event that the total number of avatars and/or
grid summaries in the hearing range is greater than K, the
control server 115 continues to carry out the alternative
cluster summarisation algorithm. In this situation the next
step 809 in the alternative cluster summarisation algorithm
is to effectively plot on the map 401 a radial ray that
emanates from the avatar (selected during the previous step
803) and goes through any of the other avatars in the
hearing range of the avatar. Subsequent to step 809, the
next step 811 is to calculate the absolute angular distance
of every avatar and grid summary in the hearing range of
the avatar. Following on from step 811 the alternative
clustering algorithm involves the step 813 of arranging the
absolute angular distances in an ascending ordered list.
The next step 815 is to calculate the differential angular
separation of each two successive absolute angular
distances in the ascending ordered list. Once the previous
step 815 has been carried out, the next step 817 is to
identify the K largest differential angular distances. The
next step 819 is to divide the hearing range of the avatar
into K portions by effectively forming radial rays between
each of the avatars that are associated with the K highest
differential angular distances. The area between the radial
rays is referred to as a portion of the hearing range.
Figure 9 depicts the effect of running the alternative
cluster summarisation algorithm on the map 401.

As an example of the previous steps of the
alternative cluster summarisation algorithm, consider a
virtual environment comprising a total of 10 avatars/grid
summaries, and a K that equals 4. Assume that the initial
steps 811 and 813 of the alternative cluster summarisation
algorithm result in the following list of absolute angular
distances in ascending order:

0, 10, 16, 48, 67, 120, 143, 170, 222 and 253, which
correspond respectively to avatars/grid summaries A0 to A9.

The subsequent step 815 of the alternative cluster
summarisation algorithm which involves calculating the
differential angular separation of each two successive
absolute angular distances in the above list will result in
the following (the final value being the separation between
A9 and A0 measured across the full circle, that is,
360 - 253):

10, 6, 32, 19, 53, 23, 27, 52, 31 and 107

The step 817 of the alternative cluster summarisation
algorithm which involves identifying the K (4) largest
differential angular distances will result in the following
being selected:

107, 53, 52 and 32

The step 819 of the alternative cluster summarisation
algorithm which involves dividing the hearing range into
portions will result in the following K (4) clusters of
avatars being defined:

1: A0, A1 and A2
2: A3 and A4
3: A5, A6 and A7
4: A8 and A9
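
The worked example above can be reproduced with the short
sketch below, which covers steps 813 to 819. The function
name and the use of list indices to stand for A0 to A9 are
assumptions made for illustration, as is the treatment of
the angles as degrees on a 360 degree circle.

```python
def partition_by_angular_gaps(angles_deg, k):
    """Split angle-sorted sources into k clusters at the k widest gaps.

    angles_deg: absolute angular distances (degrees) of the avatars and
                grid summaries in the hearing range; requires
                1 <= k <= len(angles_deg).
    Returns (gaps, clusters); clusters hold indices into the sorted list.
    """
    ordered = sorted(angles_deg)
    n = len(ordered)
    # Gap i separates source i from source i + 1; the last gap wraps round.
    gaps = [(ordered[(i + 1) % n] - ordered[i]) % 360 for i in range(n)]
    # The k largest gaps mark the source after which a cluster ends.
    by_size = sorted(range(n), key=lambda i: gaps[i], reverse=True)
    cut_after = sorted(by_size[:k])

    clusters = []
    start = (cut_after[-1] + 1) % n
    for cut in cut_after:
        members = []
        i = start
        while True:
            members.append(i)
            if i == cut:
                break
            i = (i + 1) % n
        clusters.append(members)
        start = (cut + 1) % n
    return gaps, clusters


angles = [0, 10, 16, 48, 67, 120, 143, 170, 222, 253]
gaps, clusters = partition_by_angular_gaps(angles, k=4)
```

With the angles from the example, the gaps are the ten
differential separations listed above and the clusters are
the index groups 0 to 2, 3 to 4, 5 to 7 and 8 to 9.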

Following on from the previous steps, the alternative
cluster summarisation algorithm involves the step 821 of
determining the locations of the avatars in the virtual
environment. The application software running on the
control server 115 does this by interacting with the second
of the modules 113 in the virtual environment state
maintenance system 105. Once the location of the avatars
has been determined, the alternative cluster summarisation
algorithm involves the step 823 of using the locations of
the avatars to determine the distances between the avatars
and the avatar for which the alternative cluster
summarisation algorithm is being run. Subsequent to the
step 823 the alternative cluster summarisation algorithm
involves the step 825 of using the distances to determine a
weighting to be applied to audio emanating from the avatars
in the hearing range of the avatar. The step 825 also
involves the step of using the centre of mass (determined
from the grid summarisation algorithm) to determine a
weighting for each of the grid summaries in the hearing
range of the avatar.
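
Steps 823 and 825 can be sketched as follows. The
specification does not state how a distance is converted
into a weighting, so the inverse-distance law used here,
the reference_distance parameter and the function names are
all assumptions chosen purely for illustration.

```python
import math


def distance_weight(listener, source, reference_distance=1.0):
    """Illustrative weighting: 1.0 inside reference_distance, then 1/d.

    The attenuation law is an assumption; the text only says that the
    weighting is determined from the distance.
    """
    d = math.dist(listener, source)
    return 1.0 if d <= reference_distance else reference_distance / d


def weights_for_hearing_range(listener, avatar_positions, summary_centres):
    """Weight avatars by position and grid summaries by centre of mass."""
    weights = {}
    for name, pos in {**avatar_positions, **summary_centres}.items():
        weights[name] = distance_weight(listener, pos)
    return weights


weights = weights_for_hearing_range(
    (0.0, 0.0),
    avatar_positions={"S1": (3.0, 4.0)},
    summary_centres={"Z1": (0.0, 10.0)},
)
```

Here the avatar S1 is 5 units away and the grid summary Z1
is 10 units away, so under the assumed law S1 receives twice
the weighting of Z1.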

At this stage, the alternative cluster summarisation
algorithm involves the step 827 of determining a centre of
mass for each of the portions of the hearing range
identified during the previous step 819 of dividing up the
hearing range. As with the grid summarisation algorithm,
the alternative cluster summarisation algorithm determines
the centre of mass by selecting a location in each of the
portions around which the avatars are centred.

The final step 829 of the alternative cluster
summarisation algorithm involves updating a control table
1001 (which is shown in figure 10) in the scene creation
servers 119. This involves updating the control table 1001
to include the identifier of each of the avatars in the
portions of the hearing range, the weightings to be applied
to the avatars in the portions, and the centre of mass of
each of the portions. It is noted that the control server
115 updates the control table 1001 in the scene creation
server 119 via the communication network 109.
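
One possible in-memory form of the rows written during
steps 827 and 829 is sketched below. The ClusterRow type,
its field names and the use of the mean position as the
centre of mass are assumptions for the example; the
specification defines only what each row contains.

```python
from dataclasses import dataclass


@dataclass
class ClusterRow:
    streams: list    # identifiers such as "S3" (avatar) or "Z1" (summary)
    weights: list    # one weighting per stream, in the same order
    location: tuple  # (x, y) centre of mass of the portion


def build_rows(portions, positions, weights):
    """Create one row per portion of the hearing range."""
    rows = []
    for members in portions:
        xs = [positions[m][0] for m in members]
        ys = [positions[m][1] for m in members]
        # Assumption: the centre of mass is the mean member position.
        centre = (sum(xs) / len(xs), sum(ys) / len(ys))
        rows.append(
            ClusterRow(list(members), [weights[m] for m in members], centre)
        )
    return rows


rows = build_rows(
    portions=[["S1", "Z1"], ["S2"]],
    positions={"S1": (0.0, 2.0), "Z1": (4.0, 2.0), "S2": (-3.0, 0.0)},
    weights={"S1": 0.5, "Z1": 0.25, "S2": 1.0},
)
```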

As can be seen in figure 10, the control table 1001
in the scene creation servers 119 comprises a plurality of
rows. Each of the rows corresponds to a portion of the
hearing range of an avatar and contains the identifiers of
the avatars/grid summaries (Sn and Zi, respectively) in
each portion of the hearing range. Each row of the control
table 1001 also comprises the weighting to be applied to
audio from the avatars/grid summaries (W), and the centre
of mass of the portions (which is contained in the
"Location Coord" column of the control table 1001). The
centre of mass is in the form of x, y coordinates.

Upon completing the final step 829 of the alternative
cluster summarisation algorithm, the application software
running on the control server 115 proceeds to carry out its
last step 209. The last step 209 involves interacting with
the communication network 109 to establish specific
communication links. The communication links are such
that they enable audio to be transferred from the client
computing device 107 to the summarisation server 117 and/or
the scene creation servers 119, and grid summaries
(unweighted audio streams) to be transferred from the
summarisation server 117 to the scene creation servers 119.

Once the control server 115 has completed the
previous steps 203 to 209, the summarisation server 117 is
in a position to create unweighted audio streams (grid
summaries). The summarisation server 117 is in the form of
a computer server that comprises traditional computer
hardware such as a motherboard, hard disk storage means,
and random access memory. In addition to the hardware the
computer server also comprises an operating system (such as
Linux or Microsoft Windows) that performs various system
level operations. The operating system also provides an
environment for executing application software. In this
regard, the computer server comprises application software
that is arranged to carry out a mixing process, the steps
of which are shown in the flow chart 1101 illustrated in
figure 11, in order to create unweighted audio streams.

The first step 1103 of the flow chart 1101 is
to obtain the audio streams Sn associated with each of the
avatars identified in the "Streams to be mixed" column of
the control table 501 in the summarisation server 117. The
control table 501 is illustrated in figure 5. It is
noted that the summarisation server 117 obtains the audio
streams Sn via the communication network 109. In this
regard, the previous step 209 of the control server 115
interacting with the communication network 109 established
the necessary links in the communication network 109 to
enable the summarisation server 117 to receive the audio
streams Sn. Then for each row in the control table 501, the
next step 1105 is to mix together the identified audio
streams Sn, to thereby produce M mixed audio streams. Each
of the M mixed audio streams comprises the audio streams Sn
identified in the "Streams to be mixed" column of each of
the M rows in the control table 501. When mixing the audio
streams Sn during the mixing step 1105 each audio stream Sn
retains its original unaltered amplitude.
The M mixed audio streams are therefore considered
unweighted audio streams. As indicated previously, the
unweighted audio streams contain audio from the avatars
located in the cells of the map 401, which is shown in
figure 4.

The next step 1107 in the flow chart 1101 is to tag
the unweighted audio streams with the corresponding centre
of mass of the respective cell in the map 401. This step
1107 effectively involves inserting the x, y coordinates
from the "centre of mass of the cell" column of the
control table 501. The final step 1109 in the flow chart 1101
is to forward the unweighted audio streams from the
summarisation server 117 to the appropriate scene creation
server 119, which is achieved by using the communication
network 109 to transfer the unweighted audio streams from
the summarisation server 117 to the scene creation server
119. The previous step 209 of the control server 115
interacting with the communication network 109 established
the necessary links in the communication network 109 to
enable the unweighted audio streams to be transferred from
the summarisation server 117 to the scene creation server
119.
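
The mixing step 1105 and the tagging step 1107 can be
sketched as follows. The sketch assumes that each audio
stream Sn is available as an equal-length list of samples
and that a control table row is a dictionary with "streams"
and "centre" keys; these representations and the function
names are hypothetical.

```python
def mix_unweighted(frames):
    """Sum equal-length sample lists without altering any amplitude."""
    return [sum(samples) for samples in zip(*frames)]


def create_grid_summaries(control_table, audio):
    """Produce one tagged, unweighted stream per control table row."""
    summaries = []
    for row in control_table:
        mixed = mix_unweighted([audio[s] for s in row["streams"]])
        # Tag the mixed stream with the cell's centre of mass (step 1107).
        summaries.append({"location": row["centre"], "samples": mixed})
    return summaries


table_501 = [{"streams": ["S1", "S2"], "centre": (2.0, 1.0)}]
audio = {"S1": [1, 2, 3], "S2": [10, 20, 30]}
grid_summaries = create_grid_summaries(table_501, audio)
```

Because no stream is scaled before summation, each result
is an unweighted audio stream in the sense used above.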

Once the unweighted audio streams have been
transferred to the scene creation server 119 it is in a
position to carry out a mixing process to create weighted
audio streams. The steps involved in the mixing process are
shown in the flow chart 1201 of figure 12. Each scene
creation server 119 is in the form of a computer server
that comprises traditional computer hardware such as a
motherboard, hard disk storage means, and random access
memory. In addition to the hardware the computer server
also comprises an operating system (such as Linux or
Microsoft Windows) that performs various system level
operations. The operating system also provides an
environment for executing application software. In this
regard, the computer server comprises application software
that is arranged to carry out the various steps of the flow
chart 1201.

The steps of the flow chart 1201 are essentially the
same as the steps of the flow chart 1101 carried out by the
summarisation server 117, except that instead of producing
an unweighted audio stream the steps of the latter flow
chart 1201 result in weighted audio streams being created.
As can be seen in figure 12 the first step 1203 involves
obtaining the audio streams Zi and Sn identified in the
control table 1001 of the scene creation server 119, where
Zi is an unweighted audio stream from the summarisation
server 117 and Sn is an audio stream associated with a
particular avatar. Then, for each row in the control table
1001, the flow chart 1201 involves the step 1205 of mixing
the audio streams Zi and Sn identified in the "Cluster
summary streams" column of the control table 1001, to thereby
produce weighted audio streams. Each of the weighted audio
streams comprises the audio streams Zi and Sn identified in
the corresponding row of the control table 1001. Unlike the
unweighted audio streams created by the summarisation
server 117, the audio streams Zi and Sn in
the weighted audio streams have different amplitudes. The
amplitudes are determined during the mixing step 1205 by
effectively multiplying the audio streams Zi and Sn by
their associated weightings Wn, which are also contained in
the "Cluster summary streams" column of the control table
1001.
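
The weighted mixing of step 1205 amounts to scaling each
stream by its weighting before summation. The sketch below
assumes equal-length lists of samples and a hypothetical
function name; it is an illustration of the arithmetic
rather than the actual application software.

```python
def mix_weighted(frames, weights):
    """Scale each stream by its weighting Wn, then sum sample by sample."""
    if len(frames) != len(weights):
        raise ValueError("one weighting is required per audio stream")
    return [
        sum(w * s for w, s in zip(weights, samples))
        for samples in zip(*frames)
    ]


# One avatar stream Sn and one grid summary Zi with different weightings.
weighted = mix_weighted([[1.0, 2.0], [4.0, 8.0]], [0.5, 0.25])
```

In this example the first sample is 0.5 x 1.0 + 0.25 x 4.0
and the second is 0.5 x 2.0 + 0.25 x 8.0.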

The next step 1207 in the flow chart 1201 is to tag
the weighted audio streams with the centre of mass
contained in the corresponding "Location Coord" column of
the control table 1001. This effectively involves inserting
the x, y coordinates contained in the "Location Coord"
column. The final step 1209 of the flow chart 1201 is to
forward, via the communication network 109, the weighted
audio streams to the client computing device 107 for
processing.

The client computing device 107 is in the form of a
personal computer comprising typical computer hardware such
as a motherboard, hard disk and memory. In addition to the
hardware, the client computing device 107 is loaded with an
operating system (such as Microsoft Windows) that manages
various system level operations and provides an environment
in which application software can be executed. The client
computing device 107 also comprises: an audio client 121; a
virtual environment client 123; and a spatial audio rendering
engine 125. The audio client 121 is in the form of
application software that is arranged to receive and
process the weighted audio streams from the scene creation
servers 119. The spatial audio rendering engine 125 is in the
form of audio rendering software and a soundcard. On receiving
the weighted audio streams from the scene creation server
119, the audio client 121 interacts with the spatial audio
rendering engine 125 to render (reproduce) the weighted audio
streams and thereby create an audio scene for the person
using the client computing device 107. In this regard, the
spatial audio rendering engine 125 is connected to a set of
speakers that are used to convey the audio scene to the
person. It is noted that the audio client 121 extracts the
location information inserted into the weighted audio
stream by a scene creation server 119 during the previous
step 1207 of tagging the weighted audio streams. The
extracted location information is conveyed to the spatial
audio rendering engine 125 (along with the weighted audio
streams), which in turn uses the location information to
reproduce the information as if it was emanating from the
location; that is, for example from the right hand side.
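
As a rough illustration of how location information can
drive rendering, the sketch below derives left and right
speaker gains from the tagged x, y coordinates using a
constant-power pan. This is an assumed stand-in for the
spatial audio rendering engine, not a description of it:
the two-speaker layout, the facing_deg parameter and the
function name are all hypothetical.

```python
import math


def stereo_gains(listener_xy, facing_deg, source_xy):
    """Constant-power (left, right) gains for a source at a tagged location.

    facing_deg is the direction the listener faces, measured clockwise
    from the +y axis (an assumed convention for this sketch).
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    # Bearing of the source relative to the facing direction;
    # positive values lie to the listener's right.
    bearing = math.degrees(math.atan2(dx, dy)) - facing_deg
    pan = math.sin(math.radians(bearing))   # -1 (left) .. +1 (right)
    angle = (pan + 1.0) * math.pi / 4.0     # 0 .. pi/2
    return math.cos(angle), math.sin(angle)


left, right = stereo_gains((0.0, 0.0), 0.0, (1.0, 0.0))
```

A source directly to the right of the listener is conveyed
almost entirely through the right speaker, whereas a source
straight ahead receives equal gain on both sides.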

The virtual environment client 123 is in the form of
software (and perhaps some dedicated image processing
hardware in alternative embodiments) and is basically
arranged to interact with the first of the modules 111 of
the virtual environment state maintenance system 105 in
order to obtain the dynamic state information pertaining to
the virtual environment. On receiving the dynamic state
information the graphics client 123 processes the dynamic
state information to reproduce (render) the virtual
environment. To enable the virtual environment to be
displayed to the person using the client computing device
107, the client computing device 107 also comprises a
monitor (not shown). The graphics client 123 is also
arranged to provide the virtual environment state
maintenance system 105 with dynamic information pertaining
to the person's presence in the virtual environment.

Those skilled in the art will appreciate that the
invention described herein is susceptible to variations and
modifications other than those specifically described. It
should be understood that the invention includes all such
variations and modifications which fall within the spirit
and scope of the invention.